Agriculture is in urgent need of modernization: checking whether plants are growing correctly still requires extensive manual work. Despite advances in agricultural technology, workers still have to sort and recognize different plants and weeds by hand, which costs considerable time and effort. This trillion-dollar industry stands to benefit greatly from innovations that reduce manual labor, and Artificial Intelligence and Deep Learning can sharply cut the time and energy needed to identify plant seedlings. Doing this more efficiently, and potentially more accurately, than experienced manual labor could lead to better crop yields, free people for higher-order agricultural decision making, and in the long term support more sustainable agricultural practices.
The aim of this project is to build a Convolutional Neural Network to classify plant seedlings into their respective categories.
The Aarhus University Signal Processing group, in collaboration with the University of Southern Denmark, has recently released a dataset containing images of unique plants belonging to 12 different species.
Because of the large volume of data, the images were converted to the images.npy file and the labels were stored in Labels.csv, so that you can work on the project without having to handle the raw image files.
The goal of the project is to create a classifier capable of determining a plant's species from an image.
List of Species
# Installing the libraries with the specified version.
# uncomment and run the following line if Google Colab is being used
# !pip install tensorflow==2.15.0 scikit-learn==1.2.2 seaborn==0.13.1 matplotlib==3.7.1 numpy==1.25.2 pandas==1.5.3 opencv-python==4.8.0.76 -q --user
# Installing the libraries with the specified version.
# uncomment and run the following lines if Jupyter Notebook is being used
#!pip install tensorflow==2.13.0 scikit-learn==1.2.2 seaborn==0.11.1 matplotlib==3.3.4 numpy==1.24.3 pandas==1.5.2 opencv-python==4.8.0.76 -q --user
Note: After running the above cell, kindly restart the notebook kernel and run all cells sequentially from the start again.
import os
import numpy as np # Importing numpy for Matrix Operations
import pandas as pd # Importing pandas to read CSV files
import matplotlib.pyplot as plt # Importing matplotlib for plotting and visualizing images
import math # Importing math module to perform mathematical operations
import cv2 # Importing openCV for image processing
import seaborn as sns # Importing seaborn to plot graphs
import random
# Tensorflow modules
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator # Importing the ImageDataGenerator for data augmentation
from tensorflow.keras.models import Sequential # Importing the sequential module to define a sequential model
from tensorflow.keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D,BatchNormalization # Defining all the layers to build our CNN Model
from tensorflow.keras.optimizers import Adam,SGD # Importing the optimizers which can be used in our model
from sklearn import preprocessing # Importing the preprocessing module to preprocess the data
from sklearn.model_selection import train_test_split # Importing train_test_split function to split the data into train and test
from sklearn.metrics import confusion_matrix # Importing confusion_matrix to plot the confusion matrix
# Display images using OpenCV
from google.colab.patches import cv2_imshow # Importing cv2_imshow from google.colab.patches to display images (Colab only)
# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
# Run the code below only if you are using Google Colab
from google.colab import drive
drive.mount('/content/drive')
# Load the image file of the dataset
images = np.load('/content/drive/MyDrive/Python/images.npy')
# Load the labels file of the dataset
labels = pd.read_csv('/content/drive/MyDrive/Python/Labels.csv')
print(images.shape)
There are 4750 images in the dataset, each of size 128x128 with 3 color channels.
print(labels.shape)
There are 4750 labels in the dataset, one per image.
#Plotting an image from the dataset using OpenCV
cv2_imshow(images[10])
#Plotting an image from the dataset using matplotlib
plt.imshow(images[10])
The same image is shown in different colors by OpenCV and matplotlib. OpenCV treats image arrays as BGR while matplotlib treats them as RGB, which suggests that the given NumPy arrays were generated from the original images using OpenCV. We will convert the images from BGR to RGB so that they are easier to interpret.
# Converting the images from BGR to RGB using the cvtColor function of OpenCV
# We do this conversion before data pre-processing to help with visual EDA
for i in range(len(images)):
    images[i] = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB)
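To make the conversion concrete: for 3-channel images, going from BGR to RGB only reverses the channel order of each pixel. The sketch below shows this on a tiny hand-made image in plain Python. `bgr_to_rgb` is a hypothetical helper written for illustration and is not part of OpenCV.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a nested-list image.

    `image` is a list of rows, and each row is a list of [B, G, R] pixels.
    """
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 image stored as BGR: one pure-blue pixel and one pure-red pixel
tiny_bgr = [[[255, 0, 0], [0, 0, 255]]]
tiny_rgb = bgr_to_rgb(tiny_bgr)

# After the conversion, blue is (0, 0, 255) and red is (255, 0, 0) in RGB order
print(tiny_rgb)
```

On a NumPy array the same reversal can be written as `image[..., ::-1]`.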
# Plotting an image from each of the classes with its corresponding label
unique_classes = labels['Label'].unique()
# Create a figure to display 12 images
fig, axes = plt.subplots(3, 4, figsize=(10, 8))
fig.suptitle('Image from Each Class', fontsize=16)
for i, class_name in enumerate(unique_classes):
    # Get all images of the current class
    class_images = images[labels['Label'] == class_name]
    # Randomly select one image from the class
    random_image = class_images[random.randint(0, class_images.shape[0] - 1)]
    # Plot the selected image
    ax = axes[i // 4, i % 4]
    ax.imshow(random_image)
    ax.set_title(class_name)
    ax.axis('off')
plt.tight_layout()
plt.show()
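# Countplot of classes to see the data spread/imbalance
# Passing the column as x= keeps the call valid in recent seaborn versions, where the first positional argument is `data`
sns.countplot(x=labels['Label'])
plt.xticks(rotation='vertical')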
# Plotting three random images for every category
categories = labels['Label'].unique()
fig, axes = plt.subplots(len(categories), 3, figsize=(20, 20))
for i, category in enumerate(categories):
    category_images = images[labels['Label'] == category]
    for j in range(3):
        random_index = np.random.randint(0, len(category_images))
        axes[i, j].imshow(category_images[random_index])
        axes[i, j].axis('off')
        if j == 0:
            axes[i, j].set_ylabel(category, fontsize=14)
        axes[i, j].set_title(f'{category} image {j+1}', fontsize=10)
plt.subplots_adjust(wspace=0.1, hspace=0.1)
plt.tight_layout()
plt.show()
As the size of the images is large, it may be computationally expensive to train on these larger images; therefore, it is preferable to reduce the image size from 128 to 64.
# Plot an image before resizing
plt.imshow(images[3])
# Resize the images from 128x128 to 64x64
images_decreased = []
height = 64
width = 64
dimensions = (width, height)
for i in range(len(images)):
    images_decreased.append(cv2.resize(images[i], dimensions, interpolation=cv2.INTER_LINEAR))
#Plot the same image after resizing
plt.imshow(images_decreased[3])
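As a rough check of what the resize saves, the arithmetic below counts the values per image before and after, and the memory the whole dataset would take as float32. It is a back-of-the-envelope sketch; `values_per_image` is a hypothetical helper, and the float32 assumption mirrors the normalization step used later in the notebook.

```python
def values_per_image(height, width, channels=3):
    """Number of scalar values stored for one image."""
    return height * width * channels


n_images = 4750          # dataset size reported by images.shape
bytes_per_float32 = 4    # assumption: images held as float32 after normalization

original = values_per_image(128, 128)
reduced = values_per_image(64, 64)
reduction_factor = original // reduced

print(f"Values per image: {original} -> {reduced} ({reduction_factor}x fewer)")
print(f"Dataset at 128x128: {n_images * original * bytes_per_float32 / 1e6:.1f} MB")
print(f"Dataset at 64x64:   {n_images * reduced * bytes_per_float32 / 1e6:.1f} MB")
```

Halving each side cuts the input size by a factor of four, which reduces both memory use and the work done by the first convolution layer.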
Split the dataset
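As the dataset contains relatively few images, we will use 10% of the data for testing, 10% for validation and 80% for training. Because the second split is taken from the remaining 90% of the data, its test_size of 0.111 corresponds to about 10% of the full dataset.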
from sklearn.model_selection import train_test_split
X_temp, X_test, y_temp, y_test = train_test_split(np.array(images_decreased),labels , test_size=0.1, random_state=42,stratify=labels)
X_train, X_val, y_train, y_val = train_test_split(X_temp,y_temp , test_size=0.111, random_state=42,stratify=y_temp)
print(X_train.shape,y_train.shape)
print(X_val.shape,y_val.shape)
print(X_test.shape,y_test.shape)
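The split fractions can be checked with simple arithmetic. The sketch below assumes that the held-out portion of each split is rounded up to a whole number of samples, which is how I understand `train_test_split` to treat a float `test_size`; `split_sizes` is a hypothetical helper, so compare its result with the shapes printed above.

```python
import math


def split_sizes(n_samples, test_frac, val_frac_of_rest):
    """Return (train, val, test) sizes for two consecutive splits.

    Assumption: the held-out part of each split is rounded up.
    """
    n_test = math.ceil(test_frac * n_samples)
    n_rest = n_samples - n_test
    n_val = math.ceil(val_frac_of_rest * n_rest)
    n_train = n_rest - n_val
    return n_train, n_val, n_test


sizes = split_sizes(4750, 0.1, 0.111)
print(sizes)
print([round(s / 4750, 3) for s in sizes])  # fraction of the full dataset in each part
```

Under that assumption, 0.111 of the remaining 4275 images is about 475 images, the same size as the 10% test set.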
# Convert labels from names to one-hot vectors using LabelBinarizer
from sklearn.preprocessing import LabelBinarizer
enc = LabelBinarizer()
y_train_encoded = enc.fit_transform(y_train)
y_val_encoded=enc.transform(y_val)
y_test_encoded=enc.transform(y_test)
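For intuition, the encoding step can be sketched in plain Python: the class names are sorted, and each label becomes a vector with a single 1 at the position of its class. `one_hot` and `decode` are hypothetical helpers, and the labels 'A', 'B' and 'C' are placeholders rather than species from the dataset.

```python
def one_hot(labels):
    """Return (sorted class names, list of one-hot rows) for a list of labels."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    encoded = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1
        encoded.append(row)
    return classes, encoded


def decode(classes, rows):
    """Map each row back to the class with the largest value (argmax)."""
    return [classes[row.index(max(row))] for row in rows]


demo_labels = ['B', 'A', 'C', 'A']
demo_classes, demo_encoded = one_hot(demo_labels)
print(demo_classes)
print(demo_encoded)
print(decode(demo_classes, demo_encoded))
```

Fitting on the training labels only and reusing the same encoder for validation and test, as done above, keeps the column order identical across all three sets.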
# Normalizing the image pixels
X_train_normalized = X_train.astype('float32')/255.0
X_val_normalized = X_val.astype('float32')/255.0
X_test_normalized = X_test.astype('float32')/255.0
# Clearing backend
from tensorflow.keras import backend
backend.clear_session()
# Fixing the seed for random number generators
import random
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)
# Initializing a sequential model
model = Sequential()
# Adding first conv layer with 64 filters and kernel size 3x3 , padding 'same' provides the output size same as the input size
# Input_shape denotes input image dimension of images
model.add(Conv2D(64, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))
# Adding max pooling to reduce the size of output of first conv layer
model.add(MaxPooling2D((2, 2), padding = 'same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model.add(MaxPooling2D((2, 2), padding = 'same'))
# flattening the output of the conv layer after max pooling to make it ready for creating dense connections
model.add(Flatten())
# Adding a fully connected dense layer with 16 neurons
model.add(Dense(16, activation='relu'))
model.add(Dropout(0.3))
# Adding the output layer with 12 neurons and activation functions as softmax since this is a multi-class classification problem
model.add(Dense(12, activation='softmax'))
# Using Adam Optimizer
opt=Adam()
# Compile model
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy','recall','f1_score','precision'])
# Generating the summary of the model
model.summary()
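The parameter counts reported by the summary can be reproduced by hand. The sketch below assumes the layers exactly as defined above (3x3 kernels, 'same' padding, two 2x2 poolings taking 64x64 down to 16x16); `conv2d_params` and `dense_params` are hypothetical helpers.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


conv1 = conv2d_params(3, 3, 3, 64)     # input 64x64x3 -> output 64x64x64
conv2 = conv2d_params(3, 3, 64, 32)    # input 32x32x64 (after pooling) -> 32x32x32
flattened = 16 * 16 * 32               # after the second pooling
dense1 = dense_params(flattened, 16)
output = dense_params(16, 12)
total = conv1 + conv2 + dense1 + output  # pooling, flatten and dropout add no parameters

print(conv1, conv2, dense1, output, total)
```

Most of the parameters sit in the first dense layer, because it connects all 8192 flattened values to each of its 16 neurons.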
history_1 = model.fit(
    X_train_normalized, y_train_encoded,
    epochs=50,
    validation_data=(X_val_normalized, y_val_encoded),
    batch_size=32,
    verbose=2
)
plt.plot(history_1.history['accuracy'])
plt.plot(history_1.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
loss,accuracy,recall,f1_score,precision = model.evaluate(X_test_normalized, y_test_encoded, verbose=2)
print(f'Test Loss: {loss}')
print(f'Test Accuracy: {accuracy}')
print(f'Test Recall: {recall}')
print(f'Test f1_score: {f1_score}')
print(f'Test Precision: {precision}')
# Here we would get the output as probabilities for each category
y_pred=model.predict(X_test_normalized)
y_pred
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)
# Plotting the confusion matrix using TensorFlow's tf.math.confusion_matrix() function
# (stored as conf_matrix so that the confusion_matrix function imported from sklearn is not overwritten)
conf_matrix = tf.math.confusion_matrix(y_test_arg, y_pred_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    conf_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
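To read the heatmap, it helps to see how such a matrix is built. The plain-Python sketch below uses the same layout as `tf.math.confusion_matrix(labels, predictions)`: rows are true classes and columns are predicted classes. `confusion_counts` is a hypothetical helper, and the demo labels are made up.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Made-up class indices for three classes
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0]
demo_matrix = confusion_counts(y_true_demo, y_pred_demo, 3)

# Correct predictions lie on the diagonal
correct = sum(demo_matrix[i][i] for i in range(3))
demo_accuracy = correct / len(y_true_demo)

for row in demo_matrix:
    print(row)
print(f"accuracy = {demo_accuracy:.3f}")
```

Off-diagonal entries show which class was mistaken for which: a large value in row i, column j means class i is often predicted as class j.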
from sklearn.metrics import classification_report
# Generate classification report
report = classification_report(y_test_arg, y_pred_arg)
print(report)
Observations
The accuracy plot shows a steady increase in training accuracy, indicating that the model is learning from the training data.
Validation accuracy also improves but fluctuates towards the end and decreases slightly, suggesting overfitting after the 40th epoch.
Overall accuracy is 0.65, which is moderate and can be improved. Recall (0.65), F1 score (0.63) and precision (0.65) can be improved too.
Confusion Matrix / Classification Report
Class 0: Precision: 0.00, Recall: 0.00, F1-Score: 0.00, Support: 26. Indicates poor performance: no test image was correctly assigned to this class.
Class 1: Precision: 0.76, Recall: 0.79, F1-Score: 0.78, Support: 39. Indicates good performance.
Class 6: Precision: 0.55, Recall: 0.89, F1-Score: 0.68, Support: 65. High recall but moderate precision, indicating that the model identifies most instances of Class 6 but also produces a fair number of false positives.
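The F1 score in the report is the harmonic mean of precision and recall, which can be checked against the per-class values quoted above. `f1_from` is a hypothetical helper. Because the report rounds precision and recall to two decimals, recomputing F1 from them can differ from the reported value in the last digit for some classes.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


class6_f1 = f1_from(0.55, 0.89)   # Class 6 values from the report
class0_f1 = f1_from(0.0, 0.0)     # Class 0 values from the report

print(round(class6_f1, 2))
print(class0_f1)
```

The harmonic mean is pulled towards the lower of the two values, which is why Class 6 ends up at 0.68 despite a recall of 0.89.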
Reducing the Learning Rate:
Hint: Use the ReduceLROnPlateau() callback, which decreases the learning rate by some factor if the loss has not decreased for some time. Training may start to reduce the loss again at the smaller learning rate. If the loss still does not decrease, the learning rate is reduced again in a further attempt to achieve a lower loss.
Remember, data augmentation should not be used in the validation/test data set.
# Clearing backend
from tensorflow.keras import backend
backend.clear_session()
# Fixing the seed for random number generators
import random
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)
# Augmentation for the training images only
# No rescaling is done here because the images were already scaled by 1/255 above
train_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)
# Initializing a sequential model
model1 = Sequential()
# Adding first conv layer with 64 filters and kernel size 3x3 , padding 'same' provides the output size same as the input size
# Input_shape denotes input image dimension images
model1.add(Conv2D(64, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))
# Adding max pooling to reduce the size of output of first conv layer
model1.add(MaxPooling2D((2, 2), padding = 'same'))
# model.add(BatchNormalization())
model1.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model1.add(MaxPooling2D((2, 2), padding = 'same'))
model1.add(BatchNormalization())
# flattening the output of the conv layer after max pooling to make it ready for creating dense connections
model1.add(Flatten())
# Adding a fully connected dense layer with 16 neurons
model1.add(Dense(16, activation='relu'))
model1.add(Dropout(0.3))
# Adding the output layer with 12 neurons and activation functions as softmax since this is a multi-class classification problem
model1.add(Dense(12, activation='softmax'))
# Using the Adam optimizer (an SGD alternative is kept below for reference)
# opt = SGD(learning_rate=0.01, momentum=0.9)
opt=Adam()
# Compile model
model1.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy','recall','f1_score','precision'])
# Generating the summary of the model
model1.summary()
from tensorflow.keras.callbacks import ReduceLROnPlateau
# Epochs
epochs = 50
# Batch size
batch_size = 32
reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',  # Metric to monitor
    factor=0.1,          # Factor by which the learning rate will be reduced, new_lr = lr * factor
    patience=5,          # Number of epochs with no improvement after which learning rate will be reduced
    min_lr=0.00001,      # Lower bound on the learning rate
    verbose=1            # Verbosity mode
)
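The effect of these settings can be sketched with a small plain-Python simulation. It is a simplified model of the callback, not the Keras implementation: it ignores options such as `cooldown` and `min_delta`, and it assumes a starting learning rate of 0.001, which to my knowledge is the default for `Adam()`. `simulate_reduce_lr` and the loss values are made up for illustration.

```python
def simulate_reduce_lr(val_losses, lr=1e-3, factor=0.1, patience=5, min_lr=1e-5):
    """Return the learning rate in effect after each epoch.

    Simplified rule: if the monitored loss has not improved for `patience`
    consecutive epochs, multiply the learning rate by `factor`, but never
    go below `min_lr`.
    """
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule


# Made-up validation losses: two improving epochs, then a plateau
plateau_losses = [1.0, 0.9, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95]
plateau_schedule = simulate_reduce_lr(plateau_losses)
print(plateau_schedule)

# A loss that never improves again: the rate drops twice and then stays at min_lr
stuck_schedule = simulate_reduce_lr([1.0] + [2.0] * 30)
print(stuck_schedule[-1])
```

With factor 0.1 and a floor of 0.00001, only two reductions are possible from 0.001, so a long plateau late in training leaves the learning rate at its minimum.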
history_2 = model1.fit(
    train_datagen.flow(
        X_train_normalized, y_train_encoded,
        batch_size=batch_size,
        seed=42,
        shuffle=False
    ),
    epochs=epochs,
    steps_per_epoch=X_train_normalized.shape[0] // batch_size,
    validation_data=(X_val_normalized, y_val_encoded),
    callbacks=[reduce_lr],
    verbose=1
)
plt.plot(history_2.history['accuracy'])
plt.plot(history_2.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
# Evaluate once on the test set; the values come back in the order of the compiled metrics
loss,accuracy,recall,f1_score,precision = model1.evaluate(X_test_normalized, y_test_encoded, verbose=2)
# Here we would get the output as probabilities for each category
y_pred=model1.predict(X_test_normalized)
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)
# Plotting the confusion matrix using TensorFlow's tf.math.confusion_matrix() function
# (stored as conf_matrix so that the confusion_matrix function imported from sklearn is not overwritten)
conf_matrix = tf.math.confusion_matrix(y_test_arg, y_pred_arg)
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    conf_matrix,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax
)
plt.show()
# Generate classification report
report2 = classification_report(y_test_arg, y_pred_arg)
print(report2)
Observations
Training accuracy increases steadily, reaching up to 66%. Validation accuracy increases more rapidly and then stabilises, indicating that the model is generalising well.
Confusion matrix: most of the values are concentrated along the diagonal, indicating many correct predictions; however, the model tends to confuse certain classes with others.
Classes 1, 3, 6 and 7 show strong performance with high precision and recall. Class 0 shows the poorest performance.
Comment on the final model you have selected and use the same in the below code to visualize the image.
pd.DataFrame({
    'Models': ['Base CNN Model', 'CNN Model with Data Augmentation'],
    'Accuracy': ['65%', '66%'],
    'Recall': ['65%', '66%'],
    'Precision': ['65%', '67%'],
    'F1_Score': ['63%', '65%'],
})
Final Model Selection
Comparison and Recommendation
Accuracy: The CNN Model with Data Augmentation has a slight edge with 66% compared to 65% for the Base CNN Model.
Recall: Similar improvement with the Data Augmentation model showing 66% recall.
Precision: Higher precision of 67% for the Data Augmentation model versus 65% for the Base model.
F1 Score: An improvement to 65% from 63%.
Conclusion
Based on these performance metrics, the CNN Model with Data Augmentation demonstrates overall better performance across all evaluated metrics (accuracy, recall, precision, and F1 score). Therefore, the CNN Model with Data Augmentation is the better model and is recommended for further use.
# Visualizing the predicted and correct label of images from test data
plt.figure(figsize=(2,2))
plt.imshow(X_test[2])
plt.show()
print('Predicted Label', enc.inverse_transform(model1.predict((X_test_normalized[2].reshape(1,64,64,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', enc.inverse_transform(y_test_encoded)[2]) # using inverse_transform() to get the output label from the output vector
plt.figure(figsize=(2,2))
plt.imshow(X_test[33])
plt.show()
print('Predicted Label', enc.inverse_transform(model1.predict((X_test_normalized[33].reshape(1,64,64,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', enc.inverse_transform(y_test_encoded)[33]) # using inverse_transform() to get the output label from the output vector
plt.figure(figsize=(2,2))
plt.imshow(X_test[36])
plt.show()
print('Predicted Label', enc.inverse_transform(model1.predict((X_test_normalized[36].reshape(1,64,64,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', enc.inverse_transform(y_test_encoded)[36])
plt.figure(figsize=(2,2))
plt.imshow(X_test[5])
plt.show()
print('Predicted Label', enc.inverse_transform(model1.predict((X_test_normalized[5].reshape(1,64,64,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', enc.inverse_transform(y_test_encoded)[5])
Observation
Actionable Insights
Overfitting Reduction: Use techniques such as regularisation, dropout and early stopping to improve model generalisation.
Data Augmentation and Hyperparameter Tuning: Fine-tune the model using enhanced data augmentation and hyperparameter tuning techniques.
Model Performance: Predictive performance should be enhanced further. Class imbalance should also be addressed for the minority classes by applying over/undersampling and class-weighting techniques.
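As one possible starting point for the class-weighting idea, the sketch below computes inverse-frequency weights, n_samples / (n_classes * class_count), which is the same heuristic scikit-learn uses for its "balanced" class weights. `balanced_class_weights` is a hypothetical helper and the label counts are made up; this has not been tried on the seedling data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * count of that class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up imbalanced labels: class 'a' is three times as frequent as class 'b'
demo_weights = balanced_class_weights(['a'] * 6 + ['b'] * 2)
print(demo_weights)
```

Rare classes receive weights above 1 and frequent classes weights below 1, so errors on minority classes contribute more to the loss. With class indices as keys, such a dictionary could be passed to the `class_weight` argument of Keras `fit`.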
Business Recommendations
Build and deploy an AI-powered seedling identification system for farmers and agriculturists that enables real-time identification, saving a great amount of manual labor and cost. Integrating it with mobile devices or drones can enable remote monitoring and seedling identification in large fields.
Use the insights from the classification system to optimise the allocation of resources such as water, fertilisers and pesticides, leading to savings and more sustainable farming practices.
Use the classification system for better plant identification, enabling better crop management and targeted agricultural practices.